Example-Based Sense Tagging of Running Chinese Text

Authors

  • Xiang Tong
  • Changning Huang
  • Cheng-ming Guo
Abstract

This paper describes a technique for the automatic sense tagging of running Chinese text. The system takes running Chinese text as input and outputs sense-disambiguated text. Whereas previous work (Yarowsky, 1992; Gale et al., 1992, 1993) relies heavily on statistics, the present system makes use of Machine Readable/Tractable Dictionaries (Wilks et al., 1990; Guo, in press) and an example-based reasoning technique (Nagao, 1984; Sumita et al., 1990) to treat novel words, compound words, and phrases found in the input text.

Similar resources

Hybrid Models for Chinese Unknown Word Resolution Dissertation

Word segmentation, part-of-speech (POS) tagging, and sense tagging are important steps in various Chinese natural language processing (CNLP) systems. Unknown words, i.e., words that are not in the dictionary or training data used in a CNLP system, constitute a major challenge for each of these steps. This dissertation is concerned with developing hybrid models that effectively combine statistic...

Supersense tagging for Danish

We describe the creation of a new Danish resource for automated coarse-grained word sense disambiguation of running text (supersense tagging, SST). Based on corpus evidence we expand the sense inventory to incorporate new lexical classes. We add tags for verbal satellites like collocates, particles and reflexive pronouns, to give account for the satellite-framing properties of Danish. Finally, ...

Building Chinese Sense Annotated Corpus with the Help of Software Tools

This paper presents the procedure for building a Chinese sense-annotated corpus. A set of software tools is designed to help human annotators speed up annotation and keep it consistent. The software tools include 1) a tagger for word segmentation and POS tagging, 2) an annotating interface responsible for the sense describing in the lexicon and sense annotating in the corpus, 3) ...

Combining Character-Based and Subsequence-Based Tagging for Chinese Word Segmentation

Chinese word segmentation is the initial step for Chinese information processing. The performance of Chinese word segmentation has been greatly improved by character-based approaches in recent years. This approach treats Chinese word segmentation as a character-wordposition-tagging problem. With the help of powerful sequence tagging model, character-based method quickly rose as a mainstream tec...

Chinese Part-of-Speech Tagging: One-at-a-Time or All-at-Once? Word-Based or Character-Based?

Chinese part-of-speech (POS) tagging assigns one POS tag to each word in a Chinese sentence. However, since words are not demarcated in a Chinese sentence, Chinese POS tagging requires word segmentation as a prerequisite. We could perform Chinese POS tagging strictly after word segmentation (one-at-a-time approach), or perform both word segmentation and POS tagging in a combined, single step si...

Publication date: 1993